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ABSTRACT 

Research in many areas of modern physics such as, e.g., indirect searches for dark matter and particle accel- 
eration in SNR shocks, rely heavily on studies of cosmic rays (CRs) and associated diffuse emissions (radio, 
microwave, X-rays, 7-rays). While very detailed numerical models of CR propagation exist, a quantitative 
statistical analysis of such models has been so far hampered by the large computational effort that those models 
require. Although statistical analyses have been carried out before using semi-analytical models (where the 
computation is much faster), the evaluation of the results obtained from such models is difficult, as they nec- 
essarily suffer from many simplifying assumptions, The main objective of this paper is to present a working 
method for a full Bayesian parameter estimation for a numerical CR propagation model. For this study, we use 
the GALPROP code, the most advanced of its kind, that uses astrophysical information, nuclear and particle 
data as input to self-consistently predict CRs, 7-rays, synchrotron and other observables. We demonstrate that 
a full Bayesian analysis is possible using nested sampling and Markov Chain Monte Carlo methods (imple- 
mented in the SuperBayeS code) despite the heavy computational demands of a numerical propagation code. 
The best-fit values of parameters found in this analysis are in agreement with previous, significantly simpler, 
studies also based on GALPROP. 

Subject headings: astroparticle physics — diffusion — methods: statistical — cosmic rays — ISM: general — 
Galaxy: general 



1. INTRODUCTION 

A large number of outstanding problems in physics and as- 
trophysics are connected with studies of CRs and the diffuse 
emissions (radio, microwave, X-rays, 7-rays) produced dur- 
ing their propagation in interstellar space. These include: in- 
direct searches for dark matter, the origin and propagation 
of CR; particle acceleration in putative CR sources - such 
as supernova remnants (SNRs) - and the interstellar medium 
(ISM); CR in other galaxies and the role they play in galac- 
tic evolution; studies of our local Galactic environment; CR 
propagation in the heliosphere; and the origin of extragalactic 
diffuse emissions. 

The involved nature of these studies requires reliable and 
detailed calculations. Our current knowledge of CR propaga- 
tion in the Galaxy is based on a large body of observational 
data together with substantial theoretical background: the lat- 
est developments in CR acceleration and transport mecha- 
nisms, detailed maps of the three-dimensional Galactic gas 
distribution, detailed studies of the interstellar dust, radiation 
field, and magnetic field, as well as up-to-date particle and 
nuclear cross section data and codes. However, the number of 
parameters in realistic models incorporating all of this infor- 
mation is large, and using the available data to perform statis- 
tical inference on the models' free parameters is a highly non- 
trivial task. So far, this has only been possible with analytical 
or semi-analytical models where the computation is fast (e.g., 
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Donate et al.1[2002l |Maurin et alJ|200Tl [20021 [20T0] [Putzel 
et al.|2010a| l. But, such models necessarily require many sim- 
plified assumptions to allow the problem to be analytically 
tractable and to reduce the computational load, making the 
estimation of the confidence level of their results difficult. 
More realistic treatments using the analytic approach lead to a 
growing complexity of the formulae, thus removing any com- 
putati onal advantage over the purely numerical approach (see, 
e.g., |Strong et al.|20 07|>. 

TheG~ALPROr{0 code is the most advanced of its kind. 
GALPROP uses astronomical information and other data as 
input to self-consistently predict CRs, 7-rays, synchrotron 
and other observables. The code provides a full numerical 
calculation of the CR spectra and intensities, together with the 
diffuse emissions associated with the CRs interacting with the 
interstellar gas, radiation, and magnetic fields. In this paper 
we introduce the methodology for a complete, fully numerical 
inference for propagation models parameters, using represen- 
tative CR data in a Bayesian statistical framework. We give 
results from a global analysis of CR isotope data, obtained by 
using GALPROP to predict CR spectra and a modified ver- 
sion of the SuperBayeS cod^jto carry out the statistical anal- 
ysis. We demonstrate that improvements to the GALPROP 
code, including parallelization, coupled with highly efficient 
sampling techniques and Bayesian methods now allow a fully 
numerical exploration of the parameter space of the most re- 
alistic models incorporating CRs, 7-rays, etc., as well as ex- 
perimental and theoretical uncertainties. 

The fully Bayesian approach to the problem of deriving 
constraints for CR propagation models parameters has several 
advantages. Firstly, the higher efficiency of Bayesian meth- 
ods allows us to carry out a global statistical analysis of the 
whole parameter space, rather than be limited to scanning a 
reduced number of dimensions at the time. This is impor- 

7 http://galprop.stanford.edu 

8 http://superbayes.org 
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tant in order to be able to fit simultaneously all relevant CR 
parameters and to explore degeneracies. Secondly, we can 
marginalize (i.e., integrate over) the parameters one is not in- 
terested in at almost no additional computational costs, thus 
obtaining probability distributions for the parameters of inter- 
est that fully account for correlations in the global parameter 
space. Thirdly, our method returns not only a global best fit 
point, but also statistically well-defined errors on the parame- 
ters, which is one of the most important achievements of this 
work. Finally, we are able to include in our analysis a large 
number of "nuisance" parameters (such as modulation poten- 
tials and experimental error rescaling parameters, see below 
for details) that mitigate the impact of potential systematic er- 
rors in the data and/or in the theoretical model, thus making 
our fits much more robust. Bayesian inference however re- 
quires to choose priors for the parameters involved. This is 
done very carefully in the present work, and we demonstrate 
below that our results do not depend strongly on the choice 
of priors, which again is an hallmark of a robust statistical 
analysis. 

2. COSMIC-RAY PROPAGATION IN THE GALAXY 

Here we provide a brief overview of CR production and 
propagation, more information can be found in a recent review 



while the observed abundance of radioactive CR isotopes 
(4°Be, i®Al, 17CI, 25 Mn ) allows the sep arate determination 
of halo size and diffusion coefficient (e.g., Ptuskin & Soutoul 



b y l Strong et aT1 ( |2007] l. 

The sources of CRs are believed to be supernovae (SNe) 
and SNRs, superbubbles, pulsars, compact objects in close 
binary systems, and stellar winds. Observations of X-ray 
and 7-ray emission from these objects reveal the presence 
of energetic particles, thus testifying to efficient acceleration 
processes in their neighborhood. Particles accelerated near 
the sources propagate tens of millions years in the interstel- 
lar medium (ISM) while their initial spectra and composition 
change. The destruction of primary nuclei via spallation gives 
rise to secondary nuclei and isotopes that are rare in nature, 
antiprotons, and charged and neutral pions that decay produc- 
ing secondary positrons, electrons, and 7-rays. 

Modeling CR propagation in the ISM includes the solu- 
tion of the partial differential equation describing the transport 
with a given source distribution and boundary conditions for 
all CR species. The diffusion-convection equation, sometimes 
incorporating diffusive reacceleration in the ISM, is used for 
the transport process and has proven to be remarkably suc- 
cessful despite its relative simplicity. For CR nuclei, relevant 
processes during propagation include nuclear spallation, pro- 
duction of secondary particles, radioactive decay, electron Re- 
capture and stripping, with the energy losses due to ionization 
and Coulomb scattering. For the propagation of CR electrons 
and positrons, spallation, radioactive decay, etc., are not rele- 
vant, while the energy losses are due to ionization, Coulomb 
scattering, bremsstrahlung (with the neutral and ionized gas), 
inverse Compton (IC) scattering, and synchrotron emission. 

Measurements of stable and radioactive secondary CR nu- 
clei yield the basic information necessary to probe large-scale 
Galactic properties, such as the diffusion coefficient and halo 
size, the Alfven velocity and/or the convection velocity, as 
well as the mechanisms and sites of CR acceleration. Know- 
ing the number density of primary nuclei from satellite and 
balloon observations, the production cross-sections from ac- 
celerator experiments, and the gas distribution from astro- 
nomical observations, the production rate of secondary nuclei 
can be calculated within the context of a given propagation 
model. Stable secondary CR nuclei (e.g., 5B) can be used 
to determine ratio of halo size to the diffusion coefficient, 



1998, Strong & Moskalenko 1998; Webber & Soutoul 199 
Moskalenk o et al.||2001[ ). However, the interpretation of the 
sharp peaks observed in the secondary to primary CR nuclei 
ratios (e.g., sB/gC, [2iSc+22Ti+23V]/26Fe) at relatively low 
energies, ~ 1-few GeV/nucleon, is model-dependent. 

The solar modulation of the CRs during their propagation 
in the heliosphere significantly modifies the interstellar spec- 
tra below ~ 20 GeV/nucleon. The modulated spectra are the 
actual ones measured by balloon-borne and spacecraft instru- 
ments. Modulation models are based on th e solution of the 
|Parker| (fT9 65) equation (e.g., see reviews by Potgieter| |1998 



|Heber et al. 2006). The particle transport to the inner helio- 
sphere is mainly determined by spatial diffusion, convection 
with the solar wind, drifts, and adiabatic cooling. Realistic 
time-dependent three-dimensional hydrodynamic mo dels in- 
corporating these effects have been developed (e.g.,[FTo rinski 
et al.|2003[|Langner et al.|2006HPotgieter & Langner|2004f 



The often-used method ot jGleeson & Axford| ( |1968| l, the so- 
called "force-field" approximation, employs a single parame- 
ter - the "modulation potential" - to characterize the strength 
of the modulation effect on the CR spectra as it varies over the 
solar cycle. The force-field approximation has no predictive 
power as the modulation potential depends on the assumed in- 
terstellar spectrum of CR species. However, it can be a useful 
low-energy parameterization for a given interstellar spectrum. 

Closely connected with the CR propagation is the produc- 
tion of the Galactic diffuse 7-ray emission. This is com- 
prised of three components: 7r°-decay, bremsstrahlung, and 
IC. Cosmic-ray nuclei interacting inelastically with the inter- 
stellar gas produce 7r°s that decay to 7-rays. The CR elec- 
trons and positrons interact with the interstellar gas and pro- 
duce bremsstrahlung, and with the interstellar radiation field 
(ISRF) via IC scattering producing 7-rays. Since the 7-rays 
are undeflected by magnetic fields a nd absorption in the ISM 
is negligible (Moskalenko et al. 2006), they provide the infor- 
mation necessary to directly probe CR spectr a and intensities 
in distant locations (see Moskalenko et al. 2004 , for a review). 
However, the interpretation of such observations is compli- 
cated since the observed 7-ray intensities are the line-of-sight 
integral of a sum of the three components of the diffuse Galac- 
tic 7-ray emission, an isotropic component (often described as 
"extragalactic", but this not completely certain), unresolved 
sources, together with instrumental background(s). Proper 
modeling of the diffuse 7-ray emission, including the disen- 
tanglement of the different components, requires well devel- 
oped models for the ISRF and gas densities, together with the 
CR propagation (see, e.g. JStrong et al.|2000| |2004| >. For re- 
cent measurements of the diffuse 7-ray emission b y the Ferm i 
Large Area Telescope (LAT), see |Abdo et aL] ( |2009a|d| [2010l > . 
Global CR-related properties of the Milky Way galaxy are cal- 
culated in |Strong etaL| ( |20 1 0| > . 

3. GALPROP CODE 

The GALPROP project began in late 1996 and has now 15 
years of development behind it. The key concept underlying 
the GALPROP code is that various kinds of data, e.g., direct 
CR measurements including primary and secondary nuclei, 
electrons and positrons, 7-rays, synchrotron radiation, and so 
forth, are all related to the same astrophysical components of 
the Galaxy and hence have to be modeled self-consistently. 
The code, originally written in FORTRAN90, was made pub- 
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lie in 1998. A version rewritten in C++ was produced in 2001, 
and the most recent pub lic version 54 was recently released 
( |Vladimirov et al.|2010) >. The code is available from the ded- 
icated website where a facility for users to run the code via 
online forms in a web-browser is also providecj^] 

The GALPROP code solves the CR transport equation with 
a given source distribution and boundary conditions for all CR 
species. This includes a galactic wind (convection), diffusive 
reacceleration in the ISM, energy losses, nuclear fragmenta- 
tion, radioactive decay, and production of secondary particles 
and isotopes: 



formalism for an anisotropic radiation field developed by 
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where ip = ip(r,p,t) is the density per unit of total parti- 
cle momentum, ip(p)dp = 47rp 2 /(p) in terms of phase-space 
density /(p), q(r,p) is the source term, D xx is the spatial 
diffusion coefficient, V is the convection velocity, reaccel- 
eration is described as diffusion in momentum space and is 
determined by the coefficient D pp , p = dp/dt is the momen- 
tum loss rate, r/ is the time scale for fragmentation, and r r 
is the time scale for radioactive decay. The numerical solu- 
tion of the transp ort equation is based on a Crank-Nicholson 
( [Press et al.||1992[ ) implicit second-order scheme. The spa- 
tial boundary conditions assume free particle escape, e.g., 
ip(Rh,z,p) = ip(R,±Zh,p) = 0, where Rh and Zh are the 
boundaries for a cylindrically symmetric geometry. 

Since the grid involves a 3D (R,z,p) or 4D (x,y,z,p) 
problem (spatial variables plus momentum) "operator split- 
ting" is used to handle the implicit solution. For a given 
halo size the diffusion coefficient, as a function of momen- 
tum and the reacceleration or convection parameters, is de- 
termined from secondary/primary ratios. If reacceleration is 
included, the momentum-space diffusion coeffi cient D vv is 
related to the spatial coefficient D x 
|et al|1990[[Seo & Ptuskin|1994l i: 
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where w characterizes the level of turbulence (we take w = 
1 since only the quantity v^/w is relevant), and 6 = 1/3 
for a Kolmogorov spectrum of interstellar turbulence or 5 = 
1/2 for a Kraichnan cascade (but can also be arbitrary), p 
lie 7 1 is the magnetic rigidity where p is momentum and Ze 
is the charge, Dp is a constant, an d /3 = v/c. Non-linear 
wave damping (P tuskin et al.|2006j ) can also be included via 
specifying parameters in the configuration galdef file. 

Production of seco ndary positrons and elec t rons i s calcu- 
lated as described in |Moskalenko & Strong ( 1998 1 with a 



correction by Kelner et al. ( |2006| l. Seco ndary pion produc - 
tion is calculated using th e formalism b y Dermer] ( 1986b|a|l, 



which combines isobaric (Stecker 1970) and scaling (Bad- 



hwar et al.||1977||Stephens & Badhwar||i981|l models oT tEe 
reaction, as described in Moskalenko & Strong (|1998|l, o r 
using a parameterization developed by JKamae et al.|"f 2005 ). 
Brems strahlung is calculated as described in |Strong et al.| 
(2000). The IC scattering is treated using the appropriate 

9 http://galprop.stanford.edu/webrun 



Moskalenko & Strong (2000) using the full spatial and an- 
gular distribution of the ISRF calculated usin g the FRaNKIE 
code ( [Porter & StrongJ2005 1 [Porter et al.|2008] >. 

The distribution of Galactic CR sources is based on Fermi- 
LAT 7-ray data. For this study, we use fcn{R) — 
(R/R ) a e~ p( ~ R ~ Ro \ i.e., normalized to 1 at R = R , where 
a = 1.25, and (3 — 3.56. The profile is flattened for the outer 
Galaxy compared to earlier parameterizations used, as sug- 
gested from recent F ermi studies of the 2nd Galactic quadrant 
( [Tibaldo et al.|2009| >. 

The 7-rays are calculated using the propagated CR distribu- 
tions, including a contribution from secondary particles such 
as positrons and electrons from inelastic processes in the ISM 
that increases the 7-ray flux at MeV energies (Stron g et al. 
|2004[ |Porter et aL][2008 ). Gas-related 7-ray intensities (7r u - 



decay, bremsstrahlung) are computed from the emissivities as 
a function of (R, z, E^) using the column densities of H I and 
H2 for galactocentric annuli based on recent 21 -cm and CO 
survey data with a more accurate assignment of the gas to the 
Galactocentric rings than earlier versions. The synchrotron 
emission is computed using the Galactic magnetic field model 
that can be chosen from among various models taken from 
the literature, suitably parameterized to allow fitting to the 
observations. The line-of-sight integration of the correspond- 
ing emissivities with the distributions of gas, ISRF, and mag- 
netic field yields 7-ray and synchrotron skymaps. Spectra of 
all species on the chosen grid and the 7-ray and synchrotron 
skymaps are output in standard astronomical f ormats for di- 
rect c omparison with data, e.g., HEALPi^ (Gorski et al. 
2005) 1, Fermi-LKT MapCube format for use with LAT Sci- 



ence Tools softwarj^l etc. 

Cross-sections are based on the extensive LANL database 
nuclear codes, and parameterizations (Mas hnik et al.|[2004 



The most important isotopic production cross-sections (2H, 
3H, 3He, Li, Be, B, Al, CI, Sc, Ti, V, and M n) are calcu- 
lated using o ur fits to maj or production channe ls {Moskalenko 
|& Mashnik|[2003) |Moskalenko et al.|[2003] i. Other cross- 
sections are calculated using pheno menological approxima- 
tions by I Webber et~aT[ ( pOOl) and/or [SiTberberg et al.| ( [T998] l 
renormalized to the data where they exist. The nuclear reac- 
tion network is built using the Nuclear Data Sheets. 

The GALPROP code computes a complete network of pri- 
mary, secondary and tertiary CRs production starting from in- 
put source abundances. Starting with the heaviest primary nu- 
cleus considered (e.g. 64 Ni) the propagation solution is used 
to compute the source term for its spallation products A — 1, 
A — 2 and so forth, which are then propagated in turn, and 
so on down to protons, secondary electrons and positrons, 
and antiprotons. To account for some special /3~ -decay cases 
(e.g., 10 Be— > 10 B) the whole loop is repeated twice. GAL- 
PROP includes K-capture and electron stripping processes as 
well as knock-on electrons. The inelastically scattered pro- 
tons and antiprotons are treated as separate components (sec- 
ondary protons, tertiary antiprotons). In this way secondaries, 
tertiaries, etc., are included. Primary electrons are treated sep- 
arately. 

Further details on improvements to the code, including par- 
allelization and other optimizations, improvements in line-of- 
sight integration routines, and so forth, can be found at the 

10 http://healpix.jpl.nasa.gov 

1 1 http://fermi.gsfc.nasa.gov/ssc/data/analysis 
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aforementioned website. 

4. METHODOLOGY 

4. 1 . Bayesian Inference 

The goal of this paper is to determine constraints on the 
propagation model parameters (introduced below) from ob- 
served CR spectra and we adopt a Bayesian approach to pa- 
rameter inference (see e.g. |Trotta|[2008| for further details). 
Bayesian inference is based on the posterior probability dis- 
tribution function (pdf) for the parameters, which updates our 
state of knowledge from the prior by taking into account the 
information contained in the likelihood. A recent application 



to CR propagation m odels is given in M aurin et al.| ( |2010p and 
Putze et al. (2010a I. Denoting by 8 the vector of parameters 
one is interested in constraining, and by D the available ob- 
servations, Bayes Theorem reads 



P(6|D) 



P(D\@)P(Q) 



(3) 



where P(0|D) is the posterior distribution on the parame- 
ters (after the observations have been taken into account), 
P(D|0) — £(6) is the likelihood function (when consid- 
ered as a function of G for the observed data D) and P(0) is 
the prior distribution, which encompasses our state of knowl- 
edge about the value of the parameters before we have seen 
the data. Finally, the quantity in the denominator of eq. |3]l 
is the Bayesian evidence (or model likelihood), a normaliz- 
ing constant that does not depend on O and can be neglected 
when interested in parameter inference. The evidence is ob- 
tained by computing the average of the likelihood under the 
prior (so that the r.h.s. of eq. (3) is properly normalized), 

P(D) = J P(D|e)P(6)d6. (4) 

The evidence is the prime quantity for Bayesian model com- 
parison, which aims at establishing which of the available 
models is the "best" one, i.e., the one that fits the data best 
while being the most economical in terms of parameters, thus 
giving a quantita tive implementation of Occam's razor (see, 
e.g., [Trotta 2007). The evidence ca n also be used to a ssess 
the constraining power of the data (Trotta et al.||2008J and 



:>les 



^eroz 



to carry ou t consistency checks between observa 
|et al.|2009| l. 

Together with the model, the priors for the parameters 
which enter Bayes' theorem, eq. (Bp, must be specified. Priors 
should summarize our state of knowledge and/or our theoreti- 
cal prejudice about the parameters before we consider the new 
data, and for the parameter inference step the prior for a new 
observation might be taken to be the posterior from a previous 
measurement (for model comparison issues the prior is better 
understood in a different way, see Trotta 2008]!. 

The problem is then fully specified once we g ive t he like- 
lihood function for the observations (see section |4~4| below). 
The posterior distribution P(6|D) is determined numerically 
by drawing samples from it and Markov Chain Monte Carlo 
(MCMC) techniques can be used for this purpose. In this pa- 
per we use both Metropolis-Hastings MCMC and the Multi- 
Nest algorithm, which implements nested sampling and pro- 
vides a higher efficiency, guarantees a better exploration of 
degeneracies and multimodal posteriors, and computes the 
Bayesian evidence as well (which is difficult to extract from 
MCMC methods). 



4.2. Propagation Model Parameters 

As a test case for this study we choose the diffusion- 
reacceleration model, which has been used in a number of 



studies utilizing the GALPROP code (e. g., Moskalen ko et al. 
2002) [Strong et al.||2004t |Ptuskin et aLl|2006l |Abdo et al. 
2009a and references therein). The source distribution is 
specified in Section [3] 
In this model the spatial diffusion coefficient is given by 



D xx = I3D 



Pa 



(5) 



where Dq is a free normalization at the fixed rigidity po — 
4 x 10 3 MV. The power-law index is 6 = 1/3 for Kolmogorov 
diffusion (see Section[3]l, but we take it as a free parameter for 
the purposes of this study. Fitting the B/C ratio below 1 GeV 
in reacceleration models is known to require large values of 
«Aif- In these models, a break in the injection spectra is re- 
quired to compensate for the large bump in the propagated 
spectra at low energies/nucleon. Therefore, the CR injection 
spectrum is modeled as a broken power-law, with index below 
(— v{) and above {—v%) the break as free parameters, but with 
the location of the break fixed at a rigidity 10 4 MV. The other 
free model parameters are «Aif, the halo size Zh, and the nor- 
malization of the propagated CR proton spectrum at 100 GeV 
N p , for a total of 7 free model parameters, as summarized in 
Table [T] Other models discussed in the literature may be able 
to reproduce the B/C ratio without a break in the injection 
spectra, but the present paper is mainly intended as a presen- 
tation of the method and we defer a comprehensive study of 
different possibilities to a forthcoming paper. 

The nuclear chain used starts at 28 Si and proceeds down to 
protons. The source abundances of nuclei 6 > Z > 14 have 
an important influence on the B/C and 10 Be/ 9 Be ratios used 
in this study. In our analysis, they are fixed at values deter- 
mined for A CE data at a few 100 MeV/nucleon (Mos kaIenko| 
|et al.| [2008 ), but the values are assumed to hold also at the 
GALPROP normalization energy of 100 GeV/nucleon. The 
adopted relative source abundances of the most abundant iso- 
topes (for particle flux in cm~ 2 s -1 (MeV/nucleon) -1 ) are: 
4 He: 7.199 x 10 4 , 12 C: 2819, 14 N: 182.8, ie O: 38 22, 20 Ne: 
312.5, 22 Ne: 100.1, 23 Na: 22.84, 24 Mg: 658.1, 25 Mg: 82.5, 
26 Mg: 104.7, 27 A1: 76.42, 28 Si: 725.7. These values are rela- 
tive to the proton normalization N p for a proton source abun- 
dance 1.06 x 10 6 , but this is only formal since the antiprotons, 
secondary positrons and gamma rays were computed from an 
independent fit to proton and He data. N p is used only to nor- 
malize C and O to the data, via the ratios given above (other 
data like N are not used explicitly). 

Special handling is required to treat the solar modulation of 
the propagated CR spectra, for which we introduce an extra 
nuisance parameter for each of the data set we consider. The 
motivation and choice of the Gaussian priors, with mean and 
stan dard deviation as given in Table [T] is described in Sec- 
tion |43] In addition, we also introduce a set of parameters r 
designed to mitigate the possibility that the fit be dominated 
by undetected systematic errors in the data, as explained in 
detail in the next section. Overall, we thus fit a total of 16 
free parameters, including 7 model parameter, 4 modulation 
parameters, and 5 observational variance rescaling factors. 
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TABLE 1 

Summary of input parameters and prior ranges 



Quantity Symbol Prior range Prior type 

Diffusion model parameters 

Diffusion coefficient" (10 28 cm 2 s _1 ) 
Rigidity power law index 
Alfven speed (km s _1 ) 
Diffusion zone height (kpc) 
Nucleus injection index below 10 4 MV 
Nucleus injection index above 10 4 MV 
Proton normalization (10~ 9 cm 2 sr" - 1 s- 1 MeV- 1 ) 

Experimental nuisance parameters 

Modulation parameters <fi (MV) 
HEAO-3 
ACE 
CREAM 
ISOMAX 
ATIC-2 

Variance rescaling parameters (j = 1, , 5) 

a At p = 4 X 10 3 MV. 

b We use the notation Af((i, a) to indicate a Gaussian distribution of mean and standard deviation a. 



D 
S 

''Air 



[1,12] 
[0.1,1.0] 

[0, 50] 
[1.0,10.0] 
[1.50,2.20] 
[2.05.2.50] 

[2,8] 



Uniform 
Uniform 
Uniform 
Uniform 
Uniform 
Uniform 
Uniform 



m HEAO-3 

m-ACE 
TlCREAM 
m ISOMAX 
'"1ISOMAX 
log Tj 



420, 780 
175,475 
420, 780 
370.490 


-1.5,0.0] 



Gaussian prior b 
A^(600,60) 
JV(325,50) 
Af(600,50) 
Af(430,20) 
Fixed (no modulation) 
Uniform on log Tj 



4.3. Cosmic Ray Data and Modulation 

For demonstration of the method we use the most accurate 
CR data sets available preferably taken near solar minimum^] 
The B/C ratio is well-measured by a number of space- 
and balloon-borne missions. The HEAO-3 data (Engelmann| 



et al. 1990) remain the most accurate to date in the energy 
range 0.6-35 GeV/nucleon and have been recently confirmed 
by PAMELA (R. Sparvoli, private comm.). At higher ener- 
gies, from 30 Ge V/nucleon - 1 TeV/nuc leon, we u se ATIC-2 
( |Panov et al.|2008) and CREAM- 1 da ta (|Ahn et al.|2008). A t 
low ener gies, the Voyagers 1 an d 2 ( |Lukasiak et al.|| 1999), 
Ulyss es ( |Duvernois et al.|[T9"96) , and ACE ( |de JSIolfo et al.| 
2006) 1 data agree with each other, while the ACE data (50- 
200 MeV/nucleon) have the smallest statistical error. There- 
fore, we use the ACE measurements corresponding to the so- 
lar minimum conditions ( |George et al.|20 09 ). 

The 10 Be/ 9 Be ratio is most accurately measured (70-145 
MeV/nucleon) by ACE (Yan asak et al.|200T) , which we in- 
clude in our fit. Those measurements are in agreement with 
Voyagers 1 and 2 ( |Lukasiak et al.|1999| l, and Ulysses ( |Con-| 
nell 1998) data. At higher energies (per nucleon) ther e are 
only two data points by ISOMAX plains et al]|2004| ) with 
very large error bars, which we however include in the fit. 

We also use the carbon and ox ygen spectra as m easured 
by ACE at the solar minimum (George et al. 2009) and by 
HEAO-3 HEngelmann etal.|1990) . 

As mentioned above, a very important issue is the treat- 
ment of the heliospheric modulation. We fit to the CR data in 
the whole energy range from some 10 MeV/nucleon to TeV 
energies. However, a comparison of calculated CR spectra, 
the elemental and isotopic ratios with low-energy data (be- 
low ^20 GeV/nucleon) measured inside the heliosphere re- 
quires care as the calculated spectra depend significantly on 
the treatment of the heliospheric modulation. As mentioned 
in section [2] the modulation can be realistically treated with 
full 3-dimensional models, but application of such models 
to the current study does not seem feasible since the num- 
ber of free parameters and the computing requirements would 



12 Most of the data are o btained via the database maintained at 
http://www.mpe.mpg.de/~aws/propagate.html 



considerably increase. Currently, it is only possible to us e 
a simple force-field approximation ( |Gleeson'& Axford 1968 ), 
which is characterized with the value of the modulation poten- 
tial. However, directly using the modulation potentials from 
different experiments is problematic because they cannot be 
interpreted independently from the experiments themselves. 
The derived values depend on the choices of interstellar spec- 
tra used for their analyses, which differ from experiment to 
experiment (and are sometimes not provided). 

To deal with this type of uncertainty, instead of fixing a col- 
lection of a priori values for the modulation potential, we al- 
low some flexibility to the fits and include the modulation po- 
tentials as free nuisance parameters in our inference, with one 
free parameter per experiment (i.e, ACE, HEAO-3, ISOMAX 
and CREAM- 1). To avoid unphysical/implausible values, we 
adopt Gaussianpriors with mean and standard deviation as 
given in Table [T] which are motivated by the estimated ball- 
park values of the modulation by the experimentalists. Notice 
that no modulation parameter is given for ATIC-2 as we only 
use high-energy data for that experiment and modulation is 
not relevant. 

4.4. The Likelihood Function 

For a given set of the CR model parameters O and the mod- 
ulation potential parameters <f> (where (f> = {</>!, . . . , fa}, with 
a different choice of the modulation potential for each data set 
employed) we can compute via GALPROP the ensuing CR 
spectrum, as a function of energy, <E>x (E, 0, <f>) for species 
X. Assuming Gaussian noise on the observations, we take 
the likelihood function for each observation of species X at 
energy Ei to be of the form 



(6) 



2ff (Tj 



exp 



where §x(Ej, 0,<j>) is the prediction from the CR propaga- 
tion model for species X at energy E^ $jj is the measured 
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spectrum, and cry is the reported standard deviation. The 
sub/superscript i runs through the data points within each of 
the data sets j. We assume bins to be independent, so that the 
full likelihood function is given by the product of terms of the 
form given above: 



A., 



F(Die^)=nn p ( <| xi0. 



(7) 



However, a careful analysis of a plot of the data points for 
each CR species reveals that there are fairly strong discrepan- 
cies between different data sets. This might point to either an 
underestimation of the actual experimental error bars or to un- 
detected systematic errors between data sets. If some or all of 
the reported error bars are significantly underestimated, this 
would lead to a handful of data points incorrectly dominating 
the global fit, introducing systematic bias in the reconstructed 
value of the parameters. To mitigate against undetect ed sys- 
tematics, we follow the procedure described in, e.g., Barnes 
|et al.| ( |20"0"3j ). For each data set we introduce in the likelihood 
a parameter Tj {j = 1, . . . , 5), whose function is to rescale 
the variance of the data points in order to account for possible 
systematic uncertainties. Therefore, eq. (|6]i is modified: 



p($«|e, 



(8) 



■ exp 



V 



The role of the set of parameters r = {n, . . . , T5}, which we 
call "error bar rescaling parameters", is to allow for the pos- 
sibility that the error bars reported by each of the experiments 
underestimate the true noise. We then add t to the set of pa- 
rameters 9 and sample over it, too, thus allowing the data 
themselves to decide whether there are systematic discrepan- 
cies in the reported error bars. A value Tj < 1 means that the 
data prefer a systematically larger value for the errors for data 
set j. Notice that t< not only appears in the exponential of the 
Gaussian in Eq. ((8j, but also in the pre-factor, which, being 
proportional to y/fj, ensures that Tj never attains a value arbi- 
trarily close to (implying infinite error bars). Furthermore, 
the variance scaling parameter r also takes care of all aspects 
of the model that are not captured by the reported experimen- 
tal error: this includes also theoretical errors (i.e., the model 
not being completely correct), errors in the cross section nor- 
malizations, etc. 

4.5. Choice of Priors 

The full posterior distribution for the CR propagation model 
parameters 9, the variance rescaling parameters r and the 
modulation parameters <f> is given by 

P(B, <j>, t, |D) oc P(D|0, <f>, t)P{Q)P(t)P(4>), (9) 

where the likelihood P(D|9, r, <j>) is given by Eq. <jvj and dSJ. 

The priors P(9), P{<fi) and P(t) in Eq. Q are specified 
as follows. We take the prior on a set of model parameters, 
P(9), to be uniform on 9 with ranges as given in Table m 
As shown below, the posterior is reasonably well constrained 
and close to Gaussian for 9, hence we expect our results to 
be fairly independent of the prior choice. 



This conclusion is strengthened by the inspection of the 
profile likelihood, which is obtained from our samples by 
maximizing the value of the likelihood along the hidden di- 
mensions rather then integrating over the posterior. The pro- 
file likelihood statistics is independent of the priors, provided 
the parameter space has been sampled with sufficient resolu- 
tion, and thus it constitutes a cross-check for the presence of 
large volume effects coming from the priors. Such volume 
effects are typically important when the priors play a major 
role in the inference, while they are usually negligible when 
the posterior is dominated by the likelihood (in which case 
the prior influence is minimal, see e.g. [Trotta et al.|2008| for 
an illustration). We have found the profile likelihood to be in 
excellent agreement with the posterior pdf presented below, 
and therefore we do not consider it further in our results be- 
low. This means that the prior influence is small, and our re- 
sults can be considered to be robust with respect to reasonable 
changes in priors. 

Regarding the modulation parameters, we adopt a Gaussian 
prior on each of them, informed by the values reported by 
each experiment (see Table [TJ, in order to avoid physically 
implausible values. The descrip tion of the experimental CR 
data sets can be found in Section [43] 

The Tj are scaling parameters in the likelihood, and thus the 
appropriate prior is given by the Jeffre ys ' prior, which is uni 



form on log Tj (see Barnes et al.|2003 or Jaynes & Bretthorst 



2003 for a justification). Therefore, we adopt the proper prior 



_ / 2/3 for for - 3/2 < log 7} < 
otherwise 



that corresponds to a prior on Tj of the form 

P{t 3 )^t-\ 



(10) 



(11) 



The inclusion in our analysis of the nuisance parameters 
<f) and t (which are then marginalized over, see Eq. fT7) ) 
has two main effects on our inference about the CR param- 
eters of interest, 9. Firstly, it increases the robustness of our 
fit, since the nuisance parameters account for potential sys- 
tematic effects in the data (t) and approximately capture the 
impact of solar modulation on the measurements (<j>). Sec- 
ondly, it makes our CR constraints more conservative, since 
our marginalized errors on 9 fully account for all possible 
values of the nuisance parameters compatible with the data. 

4.6. Sampling Algorithm 

A powerful and efficient alternative to classical MCMC 
methods has emerged in the last few years in the form of 
the so-ca lled "nested sampling" algorithm, invented by John 
Skilling (|Skilling|[2004l [2006 ; Fer oz & Hob"son|2lX)8l iFeroz] 
|et al.|2009[ ). Although the original motivation for nested sam- 
pling was to compute the evidence integral of eq. (p), the re- 
cent development of the M ultiNest algorithm (Feroz & Hob- 
son|2008 Feroz et al.|2009[ ) has delivered an extremely pow- 
erful and versatile algorithm that has been demonstrated to 
be able to deal with extremely complex likelihood surfaces in 
hundreds of dimensions exhibiting multiple peaks. 

As samples from the posterior are generated as a by-product 
of the evidence computation, nested sampling can also be 
used to obtain parameter constraints in the same run as com- 
puting the Bayesian evidence. In addition, multi-modal nested 
sampling exhibits an efficiency that is almost independent of 
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the dimensionality of the parameter space being explored, 
thus beating the "curse of dimensionality". 

The essential element of nested sampling is that the multi- 
dimensional evidence integral is recast into a 1 -dimensional 
integral. This is accomplished by defining the prior volume 
X as dX = P(9)d9 so that 



X(X) 



P(9)d9 



(12) 



£(6)>A 



where the integral is over the parameter space enclosed by the 
iso-likelihood contour £(9) = A. So X(X) gives the volume 
of parameter space above a certain level A of the likelihood. 
Then the Bayesian evidence, Eq. Q, can be written as 



P(D) = / C(X)dX, 



(13) 



where C(X) is the inverse of Eq. ( p~2] >. Samples from C{X) 
can be obtained by drawing uniformly samples from the likeli- 
hood volume within the iso-contour surface defined by A. The 
1 -dimensional integral of Eq. ( fT3] i can be obtained by simple 
quadrature, thus 



P(D)«^£(X, 



Wi. 



X. 



(14) 



i±2a 



(see 



Skilling 



where the weights are Wj = h (X ; 

20041 [20061 IFeroz & Hobson| |2008j |Feroz et ah| |2L)09 
Mukherjee et al.|2006| for details). It has been shown in the 



context of CMB data analysis in cosmology and in supersym- 
metry phenomenology studies that this technique reduces the 
number of likelihood evaluations by over an order of magni- 
tude with respect to conventional MCMC. 

In this pa per we adopt the pub licly available MultiNest al- 
gorithm ( Feroz & Hobson 2008 ), as implemented in the Su - 
perBayeS code ( jTrotta et al!|2"0 08 ; Ruiz de Austri et al. 2006 ), 
that we have interfaced with the GALPROP code. First, 
we performed an exploratory scan with MultiNest, adopting 
4000 live points in our 16-dimensional parameter space, with 
the aim of scouting the structure and degeneracy directions of 
the posterior distribution. An important chracteristic of Multi- 
Nest, which sets it apart from conventional MCMC methods, 
is its ability to sample reliably from multi-modal distributions. 
Therefore, it is highly desirable to employ MultiNest to per- 
form scans of parameter spaces that have not been investi- 
gated before, as MultiNest will make it less likely to miss im- 
portant substructure in the probability distribution when the 
latter is multi-modal. 

Our exploratory MultiNest scan gathered ~ 10 5 samples 
from the posterior, with an overall efficiency of about 10% 
and a total computational effort of ss 13 CPU years. This 
initial scan revealed a well behaved, unimodal distribution 
over the prior ranges given in Table [T| We then computed 
the parameter-set covariance matrix and adopted this as a 
Gaussian proposal distribution for a conventional Metropolis- 
Hastings MCMC scan. Since the proposal distribution was 
well matched to the posterior, our MCMC scan reached an ef- 
ficiency of ~ 15%, and we built 10 parallel chains with 14000 
samples each (after burn-in), for a total of 1.4 x 10 5 samples. 
We checked that the Gelman & Rubin mixing criterion ( [Gel- 
man & Rubin|1992 1 is satisfied for all of our parameters (i.e., 
R <C 0.1, where R is the inter-chain variance divided by the 
intra-chain variance). 



We verified that the posterior distribution obtained with 
MCMC was in excellent agreement with the one obtained 
from MultiNest, which validates our results as the two sam- 
pling schemes are completely different. The results presented 
in this paper are obtained from the MCMC run, which allows 
for slightly smoother posterior distributions as it contains 40% 
more samples than the MultiNest scarp] 

In order to keep the computational cost within reasonable 
limits, we carry out our MultiNest and MCMC runs assum- 
ing a relatively coarse spatial and energy/nucleon grid for the 
CR propagation. For each point in parameter space, we adopt 
5z = 0.2 kpc, SR = 1 kpc, and SE = 1.4. With these pa- 
rameters, one likelihood evaluation takes approximately 15 s 
on an 8-way 2.4 GHz Opteron CPU machine. We then repro- 
cessed the MCMC samples using importance sampling, de- 
creasing the spacing of the spatial grid by a factor of 2 in each 
direction while increasing the energy resolution to 6E = 1.1. 
The computational cost per likelihood evaluation is increased 
by a factor of ~ 20, but allows a more precise computation 
of the CR spectra. The statistical distribution is adjusted ac- 
cordingly, thus obtaining a posterior distribution that is close 
to what would have been obtained by running the scan at the 
higher resolution initially. The advantage of using posterior 
sampling in this context is that the resampling of the points 
can be done in a massively parallel way (using 800 CPUs, the 
resampling step takes only a few hours). Although the best- 
fit point shifts somewhat after importance sampling, we have 
verified that the bulk of our probability distributions remain 
stable. We therefore conclude that the results presented here 
are robust with respect to increases in the spatial and energy 
resolution of the scan. 



4.7. Parameter Inference from Posterior Samples 

Once a sequence {S^ -*, . . . , 9^ M )} of samples from 
the posterior pdf has been gathered, obtaining Monte Carlo 
estimates of expectations for any function of the parameters 
is straightforward. For example, the posterior mean is given 
by ((•) denotes the expectation value with respect to the pos- 
terior) 



(9) 



M-l 



p(e|D)6de = — J2 0(t) . C 15 ) 



4=0 



where the second equality follows because the samples QW 
are generated from the posterior by construction. In general, 
the expectation value of any function of the parameters /(9) 
is obtained as 



(/(e)) 



i 



M-l 



E /( 0(t) )- 



(16) 



t=o 



It is useful to summarize the results of the inference by giving 
the 1-dimensional marginal probability for the j th element of 
9, 9j, obtained by integrating out all other parameters from 
the posterior: 



13 The resulting chain of samples (including their statistical weight and 
likelihood) is provided as Supplementary Material, allowing the reader to 
make their own analysis, for example to investigate particular parameter cor- 
relations. 
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p(e 3 -|D) 



p(e|D)de 1 . . . de i _ide i+ i . . . de„, 



(17) 

where P(0,-|D) is the marginal posterior for the parameter 
Gj . From the posterior samples (obtained either by MCMC or 
MultiNest) it is straightforward to obtain and plot the marginal 
posterior on the left-hand-side of Eq. (17) : since samples are 
drawn from the full posterior by construction, their density 
reflects the value of P(0|D). It is then sufficient to divide 
the range of Qj into a series of bins and count the number 
of samples falling within each bin, ignoring the coordinates 
values Qi (for i ^ j). The 2-dimensional posterior is defined 
in an analogous fashion. The 1 -dimensional, 2-tail symmetric 
a% credible region is given by the interval (for the parameter 
of interest) within a% of where the samples are found, ob- 
tained in such a way that a fraction (1 — a)/2 of the samples 
are outside the interval on either side. In the case of a 1-tail 
upper (lower) limit, we report the value of the quantity below 
(above) where a% of the samples are found. 

5. RESULTS 

5.1. Cosmic-Ray Propagation Model Parameter Constraints 



InFi gures[T]|2]we show 1 -dimensional marginalized poste- 
rior probabilities for the propagation and nuisance parameters 
of the model. The red cross represents the best fit, the ver- 
tical thin line the posterior mean, and the horizontal bar the 
68% and 95% error ranges (yellow/blue, respectively). Two- 
dimensional constraints on some parameter combinations are 
presented in Figure [3] The best-fit point and posterior ranges 
are summarized in Table [2 For reference the galdef parame- 
ter definition files with thebest-fit parameter values presented 
in this study are available as Supplementary Material to this 
paper. These give precise definitions of the model used, which 
can be reproduced as required. The final MCMC chains from 
which Figures [T]|3] were produced are also available. 

We see that 6 and vah are quite well constrained, with the 
posterior mean 6 = 0.31±0.02 being very close to the canoni- 
cal value of 1/3 for Kolmogorov diffusion. The Alfven speed, 
^Aif = 38.4 ±2.1 km s _1 , is higher than in earlier studies, but 
this is dictated by the fit to the ACE data on the B/C ratio at 
low energies. The posterior intervals on the values of Dq and 
z h are fairly large, D = (8.32 ± 1.46) x 10 28 cm 2 s" 1 and 
Zh = 5.4 ± 1.4 kpc. The typical value of 4 kpc adopted in 
many studies is at the lower end of the viable range, but still 
within the 95% interval, Zh £ [3.2, 8.6] kpc. We can see from 
the D vs. z/j panel in Fig.[3]that the diffusion coefficient and 
the halo size are positively correlated, as expected. 

Other parameters exhibit less pronounced correlations. The 
injection indexes v\ and V2 are tightly constrained and almost 
uncorrelated (Figure 13}, but this reflects the fact that the posi- 
tion of the injection spectral break is fixed in this analysis, so 
that the indices are determined by 5 and v^k with their narrow 
ranges. The value of the injection index V2 = 2.38 ± 0.04 
provides a consistency check as the value of the sum v% + S 
should be close to the spectral indices of directly measured 
carbon and oxygen spectra ^2.70, and indeed we find that 
V2 + S = 2.69 ± 0.05. Comparison to the value of the in- 
jection index V\ = 1.92 ± 0.04 shows that the spectral break 
required is 0.46 ± 0.05. The pdf for the spectral break is plot- 
ted in the bottom-right panel of Fig.fT] While at face value the 
break appears very statistically signmcant, it has be be kept in 
mind that the value found is dependent on the break energy, 



which was fixed in this analysis. Future analyses will allow 
more freedom in the form of the spectrum. 

5.2. Comparison with Our Previous Results 

In general, there is remarkabl e agreement between the "by- 
eye" fitting in the past (e.g., [Strong & Mo skalenko] |1998| 
[200T] |Moskalenko et aTl|2002| |Ptuskin et al.||2006| l ancTThe 
parameter constraints found using the refined Bayesian infer- 
ence analysis described in this paper. The posterior mean val- 
ues of the diffusion coefficient D = (8.32 ± 1.46) x 10 28 
cm 2 s _1 at 4 GV and the Alfven speed v A u = 38.4 ± 2.1 
km s _1 are also in fair agreement wit h earlier estimates o f 



5.73 x 10 28 cm 2 s^ 1 and 36 km s" 1 ( jPtuskin et al.||2006| ), 
respectively. The posterior mean halo size is 5.4 ± 1.4 kpc, 
also in agreement with our earlier estimated range Zh = 4 — 6 
kpc ( Strong & Moskalen ko|200 1 \ , although our best-fit value 
of Zh = 3.9 kpc is somewhat lower, due to the degeneracy 
between Do and z^. However, the well-defined posterior in- 
tervals produced by the present study are significantly more 
valuable than just the best fit values themselves as they pro- 
vide an estimate of associated theoretical uncertainties and 
may point to a potential inconsistency between different types 
of data. 

5.3. Quality of Best-Fit model 

We now assess the quality of our best-fit model. Define the 

X 2 as 



A', 



EE 



*1 



(18) 



i.e., we compute the x 2 using the rescaled error bars for the 
datapoints (notice that x 2 ^ — 2 log P(D|9, </>, r), i.e. the x 2 
is not minus twice the log-likelihood because of the pre-factor 
containing r appearing in Eq. (8)). There are N = 76 total 
data points and M = 16 fitted parameters, including both the 
modulation and the error rescaling parameters. Therefore the 
number of degrees of freedom (dof) is 60, and for the best- 
fit model we find x 2 = 69.3, which leads to a reduced chi- 
squared x 2 /dof = 68/60 = 1.15. This is not surprising, 
since by construction the error bar rescaling parameters, r, 
are adjusted dynamically during the global fit to achieve this. 
A more detailed breakdown of the contribution to the total x 2 
by data set is given in Table [3] 

The predictions for the fitted CR spectra of the best-fit 
model parameters are shown in Figures |4]|6] including an er- 
ror band delimiting the 68% and 95% probability regions. The 
species shown are B/C and 10 Be/ 9 Be ratios, and the spectra of 
carbon and oxygen. In each plot, we show the spectrum mod- 
ulated with the potential corresponding to our best-fit param- 
eters from our global fits for each of the data sets employed. 
We also show the datasets, each with error bars enlarged by 
the best-fit value of our scaling parameters, r, as given in Ta- 
ble!^] The yellow/blue band delimits regions of 68% and 95% 
probability, and is modulated according to the potential given 
in the each panel. We emphasize that the power of our statis- 
tical technique is such that we can, for the first time, provide 
not only a best fit model but also an error band with a well- 
defined statistical meaning. 

In order to better visualize the comparison of our best-fit 
model to the fitted data, we plot in the bottom part of each 
panel the best-fit residuals i.e., the difference between data 
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FIG. 1 . — ID marginalized posterior pdf normalized to the peak for the diffusion model parameters, with uniform priors assumed over the parameter ranges 
as in TablefT] The red cross represents the best fit, the vertical thin line the posterior mean, and the horizontal bar the 68% and 95% error ranges (yellow/blue, 
respectively)! The bottom-right panel shows the pdf for the spectral index break. 



TABLE 2 

Summary of constraints on all parameters 



Quantity 


Best fit 
value 


Posterior mean and 
standard deviation 


Posterior 
95% range 


Diffusion model parameters 












D (10 28 cm 2 s" 1 ) 


6.59 


8.32 ± 1.46 


[5.45, 11.20] 


S 


0.30 


0.31 ± 0.02 


[0.26,0.35] 


VAif (kms^ 1 ) 


39.2 


38.4 ±2.1 


34.2,42.7 


z h (kpc) 


3.9 


5.4 ±1.4 




[3.2,8.6] 




V\ 


1.91 


1.92 ±0.04 


[1.84,2.00] 


VI 


2.40 


2.38 ± 0.04 


[2.29,2.47] 


N p (1CT 9 cm 2 sr" 1 s" 1 MeV" 1 ) 


5.00 


5.20 ± 0.48 


[4.32, 6.23] 


Experimental nuisance parameters 










Modulation parameters <f> (MV) 












HEAO-3 


693 


690 ± 38 




613, 763 




ACE 


357 


354 ± 22 




311,398 




CREAM 


598 


602 ± 49 




503,697 




ISOMAX 


416 


430 ± 20 




391,470 




ATIC-2 


(fixed) 


N/A 




N/A 




Variance rescaling parameters r 












HEAO-3 


-0.60 


-0.60 ±0.10 


[-0.82,-0.411 


ACE 


-0.12 


N/A 


> - 


-0.49 (1-tail) 


CREAM 


0.00 


N/A 


> 


-0.53 (1-tail) 


ISOMAX 


-0.21 


N/A 


> - 


-1.21 (1-tail) 


ATIC-2 


-0.24 


N/A 


> 


-0.84 (1-tail) 
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FIG. 2. — As in Figure[T]but for the nuisance parameters used in the analysis. 
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FIG. 3. — 2D marginalized posterior probability distributions for some parameter combinations. The yellow and blue regions enclose 68 and 95% probability, 
respectively. The encircled red cross is the best fit, the filled green dot the posterior mean. 



and best-fit model, divided by the experimental error bar (en- 
larged by the correct error scaling parameter): 



n, 



(19) 



Visual inspection of the residuals for the B/C and the 
10 Be/ 9 Be ratios (see Figures [5] and [5} shows that our best-fit 
model gives an excellent fit to those data, with the distribu- 
tion of the residuals approximately symmetric around 0. This 
indicates that there is no systematic bias of our best-fit. The 
contribution to the overall x 2 from those data sets is, if any- 



thing, smaller than would be expected statistically: Table [3] 
indicates that each datum contributes about ~ 0.3 units to the 
X 2 - This could point to a degree of overfitting, or to our er- 
ror bar rescaling parameters being too small. However, the 
origin of this slight overfitting becomes clear when one con- 
siders the oxygen and carbon spectra, and their residuals (Fig- 
ures |6j. Residuals here are significantly larger, especially at 
low energies, E < 3 GeV, and the average contribution to the 
total x 2 by eac h datum is much larger, of order ~ 1.4, see Ta- 
ble[3] Therefore, the error bars on carbon and oxygen seem to 
require enlargement in order for our model to provide a good 
fit. Notice from the shape of the residuals in Fig. [6] that there 
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FIG. 4. — B/C ratio for our best fit parameters (dashed curves). Each of the dashed curves has been modulated with the best-fit potential from our global fits, 
with value given in the legend. We also plot the fitted datasets, each with error bars enlarged by the best-fit value of our scaling paramete rs, t, as given in Tab le[2| 
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TABLE 3 

Breakdown of contributions to the total x 2 of our best fit by data set 



CR 


Data sets 


Data points, n 


x 2 


X 2 /n 


Oxygen 


HEAO-3, ACE 


20 


28.9 


1.44 


Carbon 


HEAO-3, ACE 


21 


29.5 


1.40 


B/C 


HEAO-3, ACE, ATIC-2, CREAM 


29 


8.9 


0.30 


10 Be/ 9 Be 


ACE, ISOMAX 


6 


2.0 


0.33 


All 


All 


76 


69.3 


X 2 /dof = 1-15 



is no indication of systematic bias in our fit, i.e., the enlarge- 
ment of the error bars does not come about because the model 
cannot reproduce the data, rather, because the data themselves 
seem to show an amount of scatter that is incompatible with 
a smooth spectrum (unless the error bars are enlarged suffi- 
ciently). 

As a consequence, we can conclude that it is the carbon 
and oxygen spectra that are driving the value of the error 
bars rescaling parameters for HEAO-3 and ACE. Therefore, 
the error bars are correspondingly enlarged in the B/C and 



°Be/ 9 Be spectra, since we are using one single error bar 
rescaling parameter for each experiment. This results in much 
smaller residuals for the latter spectra. Introducing a larger 
number of error bar rescaling parameters, one for each exper- 
iment and for each CR species, and fitting them independently 
could resolve this issue. Then, the rescaling will be less im- 
portant for B/C and 10 Be/ 9 Be, leading to tighter constraints 
from those data sets. 
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FIG. 7. — Electrons (left panel) and positrons (right panel) spectra for our best fit parameters, with three choices of modulation. As for the antiproton spectrum 
and diffuse 7-rays, electrons and positrons spectra have not been used in the fit, therefore the lines should be interprete d as pr edictions from our model. We 
show experimental d ata on each quantity as well: elect rons - AMS- 01 ( Alcaraz et al. 2000 ), ATIC-2 IChang et al.|2 008), HESS JAharonian et al.|2008||2009) , 
positrons - AMS-01 ^Aguilar et al.|2007) , CAPRICE (Boezio et al.|2U00) , HEA1' (beatty et al.|2004 . The dates in the legend tor the data sets give the years 
when the corresponding data were collected. 
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Left panel: total leptons (positrons plus electrons ) spectrum for our best-fit CR parameters, w ith three choices of modulation potential. We also show 
}m fi?rm(-LAT (a sum of electrons and positrons, Abdo et al. 2009c, Ackermann et al. 2010). 



FIG. 8. 

the data from fi?rm('-LAT (a sum of electrons and positrons, Abdo et al. 2009c, Ackermann et al. 2010) . Right panel : correspo nding positron fracti on, with the 
same three choices of solar mo dulation potential. We show experimental data from PAMELA ^Adnam et al.|2009) , AMS01 jAguilar et al.|200"7) , CAPRICE 
jBoezio et al.|2 000l, and HEAT ( Beatty et al. 2004). These data sets have not been fitted in the analysis. 
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6. DISCUSSION 

6.1. Implications for Antimatter and j-rays 

We also use our best-fit model to calculate secondary an- 
tiprotons, electrons, positrons, and diffuse 7-rays which were 
not fitted, but provide a useful consistency check. For this 
calculation the sp ectra of CR proton s and He were adjusted 
to the BESS data (Sanuki et al. 2000). Secondar y antiprotons 
were calculated using the same formalism as in Moskalenko 



et al.| ( 2"002) . Calculation of secondary electrons, positrons, 
and diffuse 7-rays is described in Sec. [3] The spectra of sec 



ondaries are shown in Figs. [7]|9] The primary electron injec- 
tion spectrum is bas ed on fitting to pre-Fermi electron dat a 
(conventional model, [Strong et al.|2004||Ptuskin et al.|2 006), 
and is parameterized as a broken power law with indices 
1.6/2.5 below/above 4 GeV and a steepening (index 5) above 
2 TeV, and norm alized to the Fermi data at 25 GeV (Acker- 
|mann et al.|2Q10| >. The plot for the CR electron and positron 
spectrum is shown separately in Fig. [7] and compared to rele- 
vant measurements, while the plot for total leptons (electrons 
plus positrons) is displayed in the left panel of Fig. [8] Even 
though the total electron and diffuse emission data were not 
fitted, they agree well with our best-fit model predictions. The 
positron fraction, shown in the right panel of Fig. [8] does not 
agree with the PAMELA data (Ad riani et al.||2009| ), but this 
was expected since secondary positron production in the gen- 
eral ISM is not capable of producing an abundance that rises 
with energy. 

Antiprotons, shown in Fig. [9] also not fitted, present an in- 
teresting example where the intensity at a few GeV is sig- 
nificantly underpredicted by the reacceleration models. As 
has been already shown (Moskalen ko et al![2 002 2003}, the 
antiproton flux measurements by BESS taken during the last 
solar minimum, 1995-1997 ( |Orito et al.|2000| l, are inconsis- 
tent with reacceleration models at the 40% level at about 2 
GeV, while the stated measurement uncertainties in this en- 
ergy range are 20%. The reacceleration models considered are 
conventional models, based on local CR measurements, Kol- 
mogorov diffusion, and uniform CR source spectra through- 
out the Galaxy. Models without reacceleration that can re- 
produce the antiproton flux, however, fall short of explaining 
the low-energy decrease in the secondary/primary nuclei ra- 
tio. To be consistent with both, the introduction of breaks 
in the diffusion coefficient and the injection spectrum is re- 
quired, which may suggest new phenomena in particle ac- 
celeration and propagation. An inclusion of a local primary 
component at low energies, possib ly associated with the L o- 
cal Bubble, can reconcile the data (Moskale nko et al.|2003[ ). 

Figure |9] shows that the reacceleration model underpro- 
duces antiprotons in the GeV range by ^30% also compared 
to the PAMEL A data tak en during the current solar mini- 
mum (Adriani et al.||2010). On the other hand, high-energy 



PAMELA data agree well with the predictions, which may 
suggest that the excess over the model predictions at low ener- 
gies can be associated with the solar modulation. Since reac- 
celeration is also most important below a few GeV, at present 
it does not appear possible to distinguish these effects. How- 
ever, a more systematic analysis which includes evaluation of 
other propagation models may help and will be carried out in 
a future work. 

6.2. Comparison with other Analyses 

Most other an alyses have used ana lyti cal pro pagation mod- 
els, in particular |Donato et al.| ( |2002| and Maurin e t al.| ( |2.002| >. 
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FIG. 9. — Antiproton spectrum for our best-fit CR parameters, with three 
different representative solar modulation potentials together with recent data. 
Note that the error band has not been modulated. The modulat ed curve for 
$=50 MV is most appropriate for the BESS 19 95^1997 flight jOrito et al.| 
2000 1 and PAMELA current sol ar minimum da ta ( Adriani et al. 20101, while 
the BESS-Polar flight of 2004 (Abe et al.|2008) corresponds to the higher 
level of solar activity. The data shown here have not been fitted. 



The most recent rep orted results from this approach are in 
IMaurin et aT] ( |2010| i and |Putze et alT] ( |2010a| ), which used 
X z and MCMC techniques to analyze a wide range of semi- 
analytical models. The nearest case to ours is their diffu- 
sion/reacceleration model where they found 5 — 0.23 — 0.24, 
VAif = 70 — 80 km s _1 and Zh = 4 — 6 kpc. Better fits 
are found with convection included as well, with a smaller 
VAif, but this requires a very large S — 0.8 — 0.9. In con- 
trast, we find a more plausible range of values for the diffu- 
sion/reaccleration model (however we do not consider con- 
vection here). 

We also note that they assume an injection spectrum that 
is a single power-la w in rigidity, which also includes a factor 



of ^s, ~ji^p- v ( |Putze et al |2010b) . In the case of pro- 
ton and He injection spectra in the reacceleration model (their 
Model II) they use 775 « 1. This is equivalent to our break in 
the injection spectrum provided the diffusion coefficient has a 
form Eq. Q. For heavier nuclei, C to Fe, their injection spec- 
trum has rjs ~ 2, which compensates for the large values of 
the VAif in their fits. Therefore, their best fit parameters (e.g., 
«Aif) are not equivalent to ours and are dependent on a partic- 
ular choice of the injection spectrum. Their conclusion that 
they do not need a break in the injection spectrum is thus sig- 
nificantly overstated as they need an ad hoc factor (3 VS with 
index 775 different for different groups of nuclei. 

7. CONCLUSIONS 

For the first time, we have shown that a full Bayesian anal- 
ysis is possible using both MCMC and nested sampling de- 
spite the heavy computational demands of a numerical propa- 
gation code. Furthermore, our analysis also returns the value 
of the Bayesian evidence, which could be used to rank dif- 
ferent propagation models in terms of their performance in 
explaining the data. While we have not investigated this pos- 
sibility here, we leave this topic for a dedicated discussion in 
a forthcoming publication. 

The present study provides not just best-fit values for the 
propagation parameters but, more importantly, associated un- 
certainties which fully account for correlations among param- 
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FIG. 10.— Diffuse 7-ray spectra for 10° < |fe| < 20° for our best-fit CR 
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Feraii-LAT data alo ng with the unidenti fied background and source compo- 
nents are taken from Abdo et al. ( 2009b I. The red hatched area represents the 
systematic uncertai nty in the spectrum of the diffuse emission, as given in 
Abdo et al. 1 2009b 1. Note that our best-fit model corresponds closely to that 
used in Abdo et al. (2009b I to derive the unidentified background and source 
components. Therefore, the addition of our model to these components is 
valid. Note that these data have not been fitted. 

eters, as well as for experimental and theoretical uncertain- 
ties. Such error estimates have been fully propagated to the 
predicted CRs spectra from the model, thus providing an es- 
timate of the residual uncertainty on the predictions (after fit- 
ting), which can be used to assess, e.g., potential inconsisten- 
cies between different types of data or for model selection. An 
important conclusion is that the parameter ranges derived in 
this study are consistent with previous, less systematic analy- 
ses. 



A valuable test made possible by our technique is the con- 
sistency check of the calculated spectra of other CR species, 
antiprotons, electrons, and positrons with data. Total elec- 
trons agree with the Fermi-LAH spectrum within the system- 
atic uncertainties, see Fig. 



Galactic diffuse emission 



10 The calculated spectrum of the 
or the mid-latitude range is also in 
a good agreement with the observations by the ferm/-LAT. 
The antiprotons are under-predicted, which is a general fea- 
ture of reacceleration models as we have shown before. The 
positron fraction is inconsistent with a purely secondary ori- 
gin of positrons above 1 GeV, confirming the excess above 
10 GeV attributed to sources of primary positrons. On the 
other hand, the predicted positron spectrum is consistent with 
earlier experiments given their large error bars. We await pub- 
lication of PAMELA positron data for further model compar- 
ison. 

Because of the pioneering nature of our approach, we have 
concentrated on just one particular type of model with reac- 
celeration and a power-law diffusion coefficient. Other prop- 
agation modes, e.g., convection, are not considered here. Fu- 
ture work will consider these, together with constraints from 
antiprotons, positrons, and 7-rays. 
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APPENDIX 

A. SUPPLEMENTARY MATERIAL 

The Supplementary Material accompanying this paper contains the following files, which can be used to reproduce the results 
presented in the paper. 

• Galprop-chains.tar.gv Upon unpacking, this archive contains the 10 MCMC chains used in the paper, and one .info file 
detailing what information is contained in the chains. 

• The folder BestFitSpectra contains datafiles with the best-fit spectra from our paper. The files and their contents are 
described in an accompanying README file. 

• Galprop definition files are supplied for the best-fit parameter values as well as for the posterior mean values (as per 
Table [2). 
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